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Abstract 

This article proposes a method to quantitatively measure the resilience of transportation systems using GPS 
data from taxis. The granularity of the GPS data necessary for this analysis is relatively coarse; it only 
requires coordinates for the beginning and end of trips, the metered distance, and the total travel time. The 
method works by computing the historical distribution of pace (normalized travel times) between various 
regions of a city and measuring the pace deviations during an unusual event. This method is applied to 
a dataset of nearly 700 million taxi trips in New York City, which is used to analyze the transportation 
infrastructure resilience to Hurricane Sandy. The analysis indicates that Hurricane Sandy impacted traffic 
conditions for more than five days, and caused a peak delay of two minutes per mile. Practically, it identifies 
that the evacuation caused only minor disruptions, but significant delays were encountered during the post¬ 
disaster reentry process. Since the implementation of this method is very efficient, it could potentially be 
used as an online monitoring tool, representing a first step toward quantifying city scale resilience with coarse 
GPS data. 


1 Introduction 

1.1 Motivation 

In recent years, resilience of city infrastructure has gained a great deal of attention pQ. When disasters and 
other extreme events occur, infrastructure may fail, incurring large human, economic, and environmental 
costs. This is especially relevant for transportation infrastructure, since it is crucial for city evacuations 
and emergency services in post-disaster environments. Methods are needed to quantitatively monitor the 
transportation infrastructure in terms of its ability to withstand and recover from such events. Measuring the 
performance of city-scale infrastructure with traditional traffic sensors is cost-prohibitive due to relatively 
high installation costs, but many cities already have taxi fleets equipped with GPS sensors. Though this 
analysis could be performed with any GPS data, taxi data is publicly available in some cases. The New York 
City dataset used in this analysis gives interesting insights about the performance of infrastructure during 
Hurricane Sandy and other major events. 

The goal of this article is to develop and implement a method for measuring resilience of city-scale 
transportation networks using only taxi datasets. The technique is designed with the following characteristics: 

1. The method can be applied at the city-scale, or larger. Extreme events such as hurricanes have 
the ability to affect an entire city. For this reason, it is important to examine impacts at a high-level 
city view, rather than the level of individual vehicles or streets. 
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2. The method measures network performance quantitatively, in terms of recovery time 

and peak pace deviations. Recovery time and peak performance degradation are fairly standard 
quantities of interest in the resilience literature While travel times are a natural performance 

measure for transportation networks, we instead use pace , or travel time per mile. This normalization 
accommodates the varied length of taxi trips within a city. 

3. The method accommodates the inherent variability in traffic conditions and data. The 

available data is full of noise and depends on many unmodeled human factors. As a result, the method 
evaluates events that cause statistically significant disruptions, in order to separate the signal from the 
noise. 

4. The method is computationally tractable. Since taxi trips occur very frequently in large cities, 
the amount of data available for analysis is large. In order to be tractible, the computation should be 
O(N), where N is the number of taxi trips, and ideally require only one pass through the raw data. Of 
practical significance, these single-pass algorithms could also be used to process the data in a realtime 
stream. 

1.2 Related Work 

In recent years, the study of resilience has gained popularity in the systems engineering community. Haimes 
0I3U5] gives a framework for assessing resilience, which focuses on modeling a system and the possible 
outcomes of various events. He asserts that a resilient system should suffer only slight degredation during an 
event, then rapidly recover. Reed et al. [5] note that the quality of service abruptly drops during an event, 
then exponentially decays back to typical values. They suggest that an appropriate resilience measure is the 
integral of this exponential curve. Authors in the related field of risk analysis emphasize the importance of 
unknown factors while assessing resilience mm- 

Though there is no precise consensus on the definition of resilience, peak disruption and recovery time are 
consistently discussed quantities. In other words, peak disruption measures how far the quantity of interest 
deviates from typical values, and recovery time measures how long it takes to return to typical values. Most 
of these works also emphasize that resilience must be measured with respect to a given event and quantity of 
interest. For example, one case study used the number of functioning nodes in a power grid as the quantity 
of interest, assessing resilience against hurricanes and minor events [8]. This article will follow this standard 
in the sense that it will use GPS data to measure the resilience of a transportation network with respect to 
specific events. No claims are made about the overall resilience of the network. 

Several authors have proposed quantities of interest for transportation systems. Orner et al. [5j proposed 
a method which measures the resilience of a road-based transportation network in terms of travel times 
between cities. Chang et al. [101 evaluated a post-earthquake transportation network in terms of accessibility 
and coverage. This is partly based on an accessibility metric devised by Allen et al. El, which considers 
travel times between various regions of a city. Thus, travel time is a standard quantity on which to measure 
resilience. This article will use the related quantity of pace, or travel time per mile. 

A distinct set of studies use large amounts of data to extract useful information about urban systems. 
The work most closely related to resilience is a study by He and Liu [12] . which uses loop detector data 
to measure the effect of the I-35W bridge collapse in Minneapolis in 2007. Geroliminis et al. [T3] use loop 
detector data, combined with 500 GPS vehicles to extract macroscopic traffic properties from an urban-scale 
transportation network. Other works use GPS traces of mobile devices to analyze movement patterns of 
crowds during typical days and atypical events [HI TBj. Castro et al. m present a method for inferring 
current and future traffic states from taxi GPS data. Zheng et al. E propose a method that tracks taxi trips 
between various regions of a city and identifies flawed urban planning. Another study measures temporal 
patterns in the density of taxi pickups and dropoffs to identify the social function of various city regions 
E- They point out that unusual output can be used to detect events like holidays. Chen [l]Jj specifically 
focuses on identifying anomalous taxi trajectories, in order to detect fraud or special events. Ferreira et al. 
[20] created a graphical querying tool which can be used to count taxi trips between arbitrary geometrical 
regions as a function of time. They noted the drop in the frequency of taxi trips during Hurricane Sandy 
and Hurricane Irene, pointing out that the Irene-related drop was more significant, but the Sandy-related 
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drop was longer lasting. By examining pace, we confirm that Hurricane Sandy had a longer recovery time, 
but find the contrasting result that Hurricane Sandy also has a more significant peak disruption. 

1.3 Outline and Contributions 

The contributions of this work are as follows. In Section[2j a method is proposed to use taxis as pervasive city- 
scale resilience sensors. This method detects unusual events and measures them in terms of peak disruption 
and recovery time. It introduces paces between regions of the city as the key performance measure, and 
it uses the historical pace distribution to detect and measure extreme events. In Section [3j the method is 
applied to a four-year dataset from New York City to identify and compare properties of events such as 
Hurricane Sandy. Of practical significance, the analysis identifies the relative efficiency of the pre-Sandy 
evacuation, contrasted with the gridlock of post-Sandy reentry. Conclusions and future work is summarized 
in Section |4j As a technical contribution, all code m and data [22] used in this analysis are made publicly 
available. 


2 Methodology 

2.1 Overview 

The proposed technique to measure city-scale resilience of the transportation network in response to various 
events by examining taxi trip data is done in three steps. In section [2~2| individual taxi trips are aggregated 
by origin-destination pairs in order to measure typical paces between various regions of the city. This 
aggregation technique makes it possible to extract city-scale features at various points in time, since it is 
difficult to measure resilience from individual trips. Section [2.3| imposes a one-week periodic pattern on the 
paces, defining the mean and variance of paces for each hour of the week. Finally, Section [274] uses these 
distributions to quantify how typical or atypical the pace is at a particular point in time. Atypical paces 
(e.g., the 5% most unlikely points) are flagged as events, and they are examined in more detail. 


2.2 Extraction of Time-Series Features from Aggregated Trips 

In the first stage of analysis, trips are grouped by their geographic locations and times of occurrence. More 
specifically, the city is divided into a small number, k, of large regions. This allows each taxi trip to be 
labeled as one of k 2 unique origin-destination pairs. Time is discretized into hours, so a large sample of trips 
can be gathered at any point in time. The start zone, end zone, and departure time are used to partition 
all of trips into subsets. The variable T)j it denotes the set of all trips from zone i to zone j at time t: 


= {r\o(r) e z(i), d(r) € z(j), [s(r)J = t} , (1) 

where o(r) is the origin of trip r, d(r) is the destination of trip r, z(i) is the geographic region of zone 
*, and [s(r)J is the start time of trip r, rounded down to the hour. It is assumed that i and j are both 
in {0,1, - - - , fc — 1}. Once these subsets of trips are defined, macroscopic traffic features can be extracted 
from them. Of particular interest is the expected travel time between two regions. However, travel times of 
individual vehicles between two regions are not uniform, due to the varying lengths of trips that connect the 
same regions. Much of this variation can be accounted for by normalizing against distance. In this way, the 
average pace is computed for each trip subset T it j t . Trips are weighted by their distance, since longer trips 
give more information about the state of traffic. In this way, the distance-weighted average pace, 
of taxis from zone i to zone j at time t is computed: 
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where u{r) is the travel time of trip r, l(r) is the metered length of trip r, and p(r) = is the pace of trip 
r. For a fixed value of t, all k 2 distance-weighted average paces collectively form the mean pace vector , a. 
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This vector is a function of time, and contains the k 2 pace values at a particular point in time. Specifically, 
the nth element of a ft) is given by 


a(f) n = P 



, n mod k, t 


)• 


(3) 


where n £ {0,1, 2, • • • , k 2 — 1}. 

It is desirable to use pace as the performance metric instead of the more traditional measure of vehicle 
counts, since the goal is to measure traffic conditions during extreme events. If the flow of vehicles between 
two regions drops significantly, it is difficult to determine whether this is due to increased congestion or 
decreased demand. However, an increase in pace indicates congestion, while a decrease in pace indicates 
decreased demand. Although the pace of taxis might be a biased estimate of the pace of all vehicles, logic 
dictates that if taxi drivers are stuck in traffic jams, so are the other vehicles around them. 


2.3 Identification of City-Scale Typical Behavior 

The mean pace vector, a ft), has a strongly periodic weekly pattern. During rush hour, the pace is high, 
especially in dense downtown regions, and at night the pace is low. On weekends, the rush hour is less 
extreme. However, the mean pace vector has some variance around this periodic pattern, so it is viewed as a 
distribution conditioned on time. For example, the mean pace vector for all Tuesdays at 3pm will be slightly 
different, and significantly different during an unusual event. To facilitate this grouping, the reference set 
Qt is defined for all times t. This set contains all of the mean pace vectors which occur at the same point 
in the periodic pattern as a(f), except for aft) itself. Intuitively, when deciding how typical the traffic data 
is at time t , that data should not be used as part of the definition of typical. Since there are 168 hours in a 
week, the reference set can be defined as 


Qt = {a(h)\h = t mod 168, h ^ t}. (4) 

The reference set Qt makes it possible to compute the expected value of the mean pace vector ^.ft) as well 
as the covariance matrix X(f). This covariance matrix is important because it quantifies the noisy day-to-day 
fluctuations in the mean pace vector, outside of the event at hand, and how the dimensions correlate. The 
time-dependent sample mean and covariance matrices can be defined as: 


“ \Qt\ E 
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(5) 


If an independence assumption is desired, the diagonal components of these matrices can be extracted. 
However, it is likely that many of the fc 2 dimensions of aft) are highly correlated, so the full covariance 
matrix is used for the remainder of the analysis. For example, trips that start or end in the same region 
often have highly correlated paces. Together, ^(t) and E(f) make it possible to identify unusual mean pace 
vectors. 


2.4 Detection of Deviations from Typical Behavior 

Intuitively, /r(f) captures the expected traffic conditions at a particular point in time. If the observed traffic 
conditions are significantly far from this expectation, then those conditions are classified as an extreme event. 
The covariance matrix £(f) is also considered; if there is typically very little deviation from /r(f), then a 
large deviation is even more extreme. In one dimensional cases, this is typically addressed by standardizing 
the data via a z-score. In higher dimensions, the generalized z-score is called the Mahalanobis distance . 
For this analysis, the Mahalanobis distance for an observed mean pace vector is viewed as a function of the 
time that the observation occurred: 

Mft) = yj (a (t) - /r(t)) T S(t)- 1 (a(t) - fift)). (6) 
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Event Detection - Thrashing 


Original Events 



Merge Nearby Events 



Figure 1: Demonstration of event detection. Events are detected when M(t) goes above the threshold, but 
thrashing often occurs. The top graph shows that this thrashing causes events to be divided into several 
pieces. For this reason, events with fewer than six hours between them are merged, as shown in the bottom 
graph. 


This time-dependent Mahalanobis distance serves as an outlier score for observations at various points 
in time. Note that it normalizes the deviations in each dimension by the corresponding variances, and 
also considers correlations between dimensions. The Mahalanobis distance is a natural way of measuring 
outliers in multivariate normal data, and it has shown to be useful even when the data is not normal 
In fact, the multivariate generalization of Chebyshev’s inequality gives an upper bound on the probability 
of observing a Mahalanobis distance greater than some fixed value 1251 . In other words, it is unlikely to 
observe a datapoint with a high Mahalanobis distance, regardless of the distribution. So, when M(t) rises 
above a given threshold, an unusual event is detected. The event is declared complete when M(t) returns 
to a value lower than the threshold. In this work, the choice of the threshold is the 95% quantile of M(t), 
but this value can easily be lowered to detect smaller events or raised to detect only the most severe events. 
The function M(t ) is a fairly noisy, which means that it can occasionally thrash over the threshold. In other 
words, M(t) may rise above the threshold, then immediately drop back below it, effectively breaking the 
event into two pieces. To prevent this, consecutive events separated by fewer than six hours are merged. 
Figure |T| illustrates this process. 

Once the recovery time of an event is computed, other properties can be computed. For example, it 
is possible to compute the maximum pace deviation, or the slowest type of trip during the event. Thus, 
each event can be described with a set of meaningful statistics. Comparisons between various events make it 
possible to describe which types of events the city can easily endure, and where there is room for improvement. 
For longer-lasting events like Hurricane Sandy, it is possible to examine different stages of the event in greater 
detail. 
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pickup 

datetime 

dropoff 

datetime 

duration 

(sec) 

distance 

(mi) 

pickup 

Ion 

pickup 

lat 

dropoff 

Ion 

dropoff 

lat 

2013-05-01 

00:02:11 

2013-05-01 

00:14:28 

737 

2.9 

-74.00 

40.74 

-74.01 

40.71 

2013-05-01 

00:02:12 

2013-05-01 

00:12:31 

618 

1.8 

-74.00 

40.73 

-73.98 

40.72 

2013-05-01 

00:02:12 

2013-05-01 

00:07:39 

326 

1.3 

-73.97 

40.76 

-73.96 

40.77 

2013-05-01 

00:02:13 

2013-05-01 

00:04:35 

141 

0.6 

-73.99 

40.75 

-74.00 

40.75 

2013-05-01 

00:02:14 

2013-05-01 

00:04:09 

115 

0.5 

-73.98 

40.75 

-73.99 

40.74 


Table 1: A small subset of the data used in this analysis. Each row corresponds to an occupied taxi trip. 


3 Application to Hurricane Sandy with New York City Taxi Data 

In this section, the previously described methodology is applied to a dataset of New York City taxi trips. 
This dataset, which was obtained through a Freedom of Information Law (FOIL) request, covers four years 
of operation and details nearly 700 million trips. Many events are detected within this four year span and 
compared quantitatively. Special attention is given to Hurricane Sandy and some interesting properties are 
discovered. 

3.1 The Dataset 

The data used in this analysis takes the form of a large table where each row represents a single taxi trip. 
Table [l] gives a small sample of this data. Note that this data format is the minimum amount of information 
required to perform the analysis. Other datasets may contain, for example, periodic GPS updates, but this 
is at least as much information as the New York City data. As there are several entries per second for four 
years, the raw data takes up about 116GB in text CSV format. We have made this large dataset publicly 
available [ 221 . 

Note that this data only records trips where the taxi is occupied by a passenger. Non-occupied trips are 
not recorded. The dataset also contains a large number of errors. For example, there are several trips where 
the reported meter distances are significantly shorter than the straight-line distance, violating Euclidean 
geometry. Additionally, many trips report GPS coordinates of (0,0), or contain impossible distances, times, 
or velocities. All of these types of obvious errors are discarded and account for roughly 7.5% of all trips. 

After removing errors, the dataset is then filtered to remove data outside of the scope of the analysis. 
For example, there are many trips which start in Midtown, travel over 50 miles, then end less than a 
block from their starting points. These trips are entirely possible, but unlikely to be representative of 
Midtown-to-Midtown trips because they likely drove many miles in other areas. This filter is implemented 
by thresholding the winding factor , or metered distance over straight-line distance. Trips which last less than 
60 seconds are also unlikely to give accurate pace estimates because the initial non-driving time becomes 
more important. These types of trips are also removed, accounting for roughly 4% of the original data. 
Figure [2] shows histograms of all trip features considered for filtering, as well as the thresholds used for 
invalid data. Additionally, the entire months of August and September 2010 were discarded due to a high 
number of errors. 
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Data Filtering 


Latitude 
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Figure 2: Distributions of individual features of taxi trips. Simple thresholds are used to filter trips that 
contain errors, or are otherwise uninformative. Note that the straightline distance is the Euclidean distance 
between start and end coordinates, while the metered distance is the value reported by the taximeter. The 
winding factor is the metered distance divided by the straightline distance. A winding factor less than 1 is 
geometrically impossible, and a large value indicates that the taxi did not proceed directly to its destination. 
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3.2 Computational Issues 


Due to the size of the dataset, an efficient software implementation of the analysis is crucial. This section 
discusses the algorithmic and practical aspects of the analysis, using the New York City taxi dataset as an 
example. In this way, concrete figures can be used for quantities like runtime or data size. More general 
concepts like time complexity do not depend on the dataset. 

The first step described in Section 2.2 is the most computationally expensive. All of the 697,622,444 
individual trips are aggregated into 35,064 mean pace vectors - one for each hour in the four-year dataset. 
Since the trip data is sorted chronologically, it is possible to compute these mean pace vectors in a single 
pass. Recall from § that the mean pace computation involves the sum of trip durations and the sum of 
trip distances. Thus, these two sums are initialized to zero for each of the 16 types of trips. Each time a 


trip is read from the file, the relevant sums are incremented. The error filtering from Section 3.1 can also 
be performed at this stage, so an additional pass of the dataset is not required. When the start hour of 
the current trip (rounded) is greater than the start hour of the previous trip, the sums are complete for the 
previous hour. The mean pace vector is computed by division and output, then the sums are reset to zero. 
Thus, the computation is one large loop over the entire dataset. A short pseudocode is given in Algorithm]!] 
Note that NUM_TYPES is 16, since there are four regions. 


Algorithm 1 Online Mean Pace Vector Extraction 

prev hour := — 1 

> Start at beginning of time 

surmduration := zeros(NUM.TYPES) 

> Initialize sums to 0 

sum_distance := zeros(NUM_TYPES) 

> Initialize sums to 0 

for all trip £ chronologicaLtrips do 

> Loop over all trips 

while trip.hour > prev_hour do 

> If previous hour is complete: 

output(prevJrour, 

> Output mean pace vector 

sum duration := zeros(NUM_TYPES) 

> Reset sums to 0 

suimdistance := zeros(NUM_TYPES) 

> Reset sums to 0 

prev_hour+= 1 

> Advance to next hour 

end while 

if trip, is Valid () then 

> Data filtering 

i 4— category(trip.pickup, trip.dropoff) 

[> Determine trip type 

sum_duration[i] += trip.duration 

> Update distance sum 

sum_distance[«] += trip.distance 

> Update duration sum 

end if 


end for 



Since each trip is accessed only once, the computation is 0(N ), where N is the total number of trips. 
The computation of each hour timeslice is independent, making it possible to employ parallel processing if 
the data is partitioned ahead of time. The analysis was implemented in Python (source code available at 
m) and run on an 8-core 2.5GHz machine with 24GB of RAM. The extraction of all 35,064 mean pace 
vectors took about 75 minutes, using roughly 40MB of RAM for each of the eight processes. The fact that 
the runtime is much shorter than the real timespan of the dataset combined with the single-pass property 
means that this preprocessing could be performed in realtime. In other words, this system could realistically 
collect trips as they occur, update the relevant sums, then output the mean pace vector at the end of the 
hour. 

The remaining computations involve mean pace vectors instead of raw trip data. They also have linear 
time complexity and are much faster than the preprocessing. Recall from Q and ([5]) that, at a particular 
hour, the mean and covariance need to be computed for all hours in the periodic pattern except that hour. 
The naive implementation of this calculation has a quadratic time complexity, since each mean pace vector 
much be compared against every other mean pace vector in the group. However, it is possible to compute 
all of these quantities in linear time. Instead of directly computing the mean of all values except A(t), the 
sum of all values including A(t) is computed up front. Then, in the loop, A(t) is subtracted from this sum. 
Formally, the inclusive reference set , Qt+, is defined in a similar way to Q, except that it includes the mean 
pace vector A (t). In other words, 








Qt+ = {A(/i)|ft. = t mod 168} = Q t U {A(t)}. (7) 

Unlike the reference set from Q, the inclusive reference set is identical for values of t that occur at the 
same point in the periodic pattern. Thus, Q t + and the sum of all vectors in Q t+ only need to be computed 
once. To compute the sum of all vectors in Q t + except A(£), it is sufficient to subtract A (t) from this sum. 
Thus, the mean computation can be written as 


f ‘ W= ]QJ £ A = 


A GQt 



-A (i) 


( 8 ) 


A similar technique is used for the sum of outer products in the covariance computation. This method 
avoids redoing most of the addition in each iteration, allowing for a significant improvement on large datasets. 
Once /i(t) and X(t) are computed, M(t) can be computed in constant time. Thus, the entire operation runs 
in linear time. On the same machine, this computation ran in less than 10 seconds, producing the timeseries 
of M(t). Again, this operation would be feasible in a real-time system. However, it is worth noting that it 
may be desirable to re-generate old values of M(t) in light of new information. 

Once M(t) is generated, the event detection described in Section 2.4 can also be performed in linear 
time. Events and spaces between events are stored as a linked list, where each node contains the start time 
and end time. Scanning through M(t) chronologically, a new node in the linked list is generated each time 
M(t) crosses above or below the threshold. Then, to remove short spaces between events, this linked list is 
iterated upon. Each time a non-event node of less than the desired duration is discovered, that node and 
its two neighbors are replaced with one larger node. On the same machine as the previous computations, it 
took less than one second to perform the event detection. 


3.3 Extraction of Pace Features 


The map of New York City is first split into four large regions, shown in Figure [3] For the remainder of 
the analysis, the zones will be referred to in the following way: Upper Manhattan (U), Midtown (M), Lower 
Manhattan (L), and East of the Hudson River (E). Note that the Eastern region is connected only by bridges 
and tunnels and thus problems with this infrastructure will tend to increase travel times between this region 
and others. Specifically relevant to Hurricane Sandy is the Lower Manhattan region, since it experienced 
severe flooding and power outages. Choosing four large regions in this way satisfies the first goal outlined in 
Section [Li] because it defines meaningful city-scale properties. Instead of looking at every street in New York 
under a microscope, it defines iarge areas with key geographic and infrastructural properties. The travel 
times between these regions reflect the overall performance of city-scale transportation infrastructure. It is 
worth noting that the methodology allows for an arbitrary choice of regions. This implementation simply 
chooses zones which are useful for detecting the types of events that occur in New York City. 

Recall that a taxi can take one of 16 possible trips between these regions. Aggregating these trips by 


type and hour as in Section 2.2 produces the 16-dimensional mean pace vector, A(i), at all points in time. 
Figure [4] shows three typical weeks of mean pace vectors, revealing the expected periodic pattern. 


3.4 Analysis of Events 


As detailed in Section 2.3, the expected behavior is generated for all times t according to pit) and E(t). An 
interesting way to view the mean pace vector A (ft) is by standardizing it, element by element, producing the 
standardized pace vector. The ith element of this vector is given by 


s = A(t) M* )i ( 9) 

Intuitively, the standardized pace vector tells how many standard deviations away from the mean the pace 
of each category of trips is at time t. In other words, it is possible to identify the trips that are going slower 
or faster than expected, and how significant this difference is. Figure [5] shows the standardized pace vector 
during the week of Hurricane Sandy. This figure gives some intuition on the behavior of various regions of the 
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Figure 3: Division of New York City into four large regions denoted U,M,E, and L. A random sample of 
0.01% of the taxi trips in 2012 are shown. Pickup locations are marked in green, and the corresponding 
dropoffs are marked in red. The majority of trips occur in Manhattan, with especially high concentration in 
the Midtown region. 
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Mean Pace Vector - Three Typical Weeks 
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Figure 4: The mean pace vector, a (t) for three typical weeks, starting on April 4, 2010. A periodic pattern 
is observable, with high paces during rush hour. 
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Standardized Pace Over Time - Week of Hurricane Sandy 



Standardized Pace (Z-Score) 
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Figure 5: The standardized pace vector during the week of Hurricane Sandy, 2012. Labels are included to 
show the times of specific phases of the event [26] . An average week would have values of zero everywhere, 
but significant deviations are shown during the week of Hurricane Sandy. Missing data (hours where there 
are less than five occurrences of a given trip) are marked with black Xs. 


city during and after the hurricane. It also includes labels indicating the occurrences of various phases of the 
event, obtained from a post-Hurricane Sandy study from NYU |26j . Standardizing each origin-destination 
pace separately allows for additional insight beyond the Mahalanobis distance. 

The most notable finding is that the slowest traffic occurred on Wednesday October 31st, almost two 
days after the hurricane struck land. On this day, some airports, buses, and commuter rails attempted to 
resume normal service, but much of the infrastructure was still damaged [26j . It is even more surprising 
that Midtown-to-Lower Manhattan and Lower Manhattan-to-Lower Manhattan travel times are significantly 
lower than expected during this time. The pace of these trips remains almost five standard deviations below 
the mean until Saturday the third, despite the severe flooding and power outages in Lower Manhattan. The 
fact that a hurricane can actually make traffic move faster in some areas of the city indicates that the usage 
of the infrastructure changed. It is likely that the hurricane decreased demand on the transportation network 
in Lower Manhattan until the infrastructure began to recover. 

This standardized pace vector gives a meaningful interpretation of unusual travel times between various 
regions of the city, but it fails to account for correlations between these typical travel times i.e., the off- 
diagonal elements of S(f). In contrast, the Mahalanobis distance M(t) considers the full covariance matrix. 
As described in Section [2A{ events are detected when M(t) goes above a threshold for a significant period 
of time. Figure [6] shows this process, along with the average pace of all taxis. Table [2] shows the top ten 
events, sorted by duration. At the top of the list is Hurricane Sandy, taking over five and a half days for 
travel times to return to normal. This is over three times the recovery time of Hurricane Irene. This agrees 
with the results of 123. which showed that the total number of Manhattan taxi trips returned to normal 
more quickly during Hurricane Irene than Hurricane Sandy. At its worst, Sandy added over two minutes to 
each mile driven by taxis in the city, while Irene added less than forty seconds. This is in contrast to the 
results of |20j . which showed that the peak drop in the number of taxi trips was greater during Hurricane 
Irene. The blizzard of December 2010, while shorter, added four minutes of travel time to each mile at its 
peak. 

It is difficult to evaluate the accuracy of the results in Table [2j since the true severity of each event is 
not known. If a training set of events is available, one could raise or lower the detection threshold until the 
desired balance between type I and type II errors is reached. 
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Event Detection 
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Figure 6: Probabilistic detection and measurement of the event Hurricane Sandy. The Mahalanobis distance, 
M(t ), is plotted in the top figure and events are detected when it goes below the threshold. For comparison, 
the average pace of all taxis in the city is plotted below and compared to the expected value. Green areas 
indicate that travel times are low, but red indicates that they are unusually high. 


Event 

Start Time 

Duration 

(hours) 

Max 

(min/mi) 

Min 

(min/mi) 

Worst Trip 

Sandy 

2012-10-28 

21:00:00 

132 

2.25 

-1.6 

E —>• M 

Blizzard 

2010-12-26 

13:00:00 

112 

4.41 

0.33 

M -» M 

Blizzard 

2011-01-31 

08:00:00 

49 

2.04 

0.34 

E -> E 

Irene 

2011-08-27 

13:00:00 

43 

0.64 

-1.66 

E —> E 

Unknown 

2013-10-12 

03:00:00 

33 

1.09 

0.08 

E —5> L 

Blizzard 

2013-02-08 

06:00:00 

26 

1.54 

-0.58 

E —> E 

Blizzard 

2010-02-10 

06:00:00 

24 

0.67 

-1.01 

E —> E 

New Years 

2012-12-31 

15:00:00 

20 

1.42 

-2.66 

E -> M 

Unknown 

2011-09-09 

08:00:00 

19 

1.66 

0.35 

U -> U 

Blizzard 

2011-01-28 

02:00:00 

18 

2.57 

0.49 

L —> L 


Table 2: Comparison of New York City transportation infrastructure resilience to the 10 longest events. 
The duration in hours, and the maximum/minimum pace deviation in minutes/mile is given for each event. 
Note that a positive number indicates a delay while a negative indicates a decreased pace. The final column 
indicates which of the 16 trips most frequently had the highest standardized pace during the event. Labels 
for events (the first column) are determined manually (cf. [27jh 
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4 Conclusion 


This analysis has shown that it is possible to detect and measure the effects of unusual events on transporta¬ 
tion infrastructure using only taxi GPS data. This is a first step toward assessing and improving city-scale 
resilience. Of key importance, the method is extremely low cost, because it does not require the installation 
of any additional sensors. This method proposes computing origin-destination paces , or average travel time 
per mile between various regions of the city. The effects of various events are quantified by the sizes and du¬ 
rations of pace deviations from typical values. Importantly, this measurement considers the typical statistics 
of traffic conditions, so significant events can be distinguished from random day-to-day fluctuations. 

The proposed method is applied to a dataset from New York City, and Hurricane Sandy is analyzed in 
detail. The analysis shows this was the longest event in the four year dataset, and one of the most severe in 
terms of peak pace deviation. At its worst, Hurricane Sandy caused over two minutes of delay per mile, but 
actually resulted in faster traffic for most of its duration. Most interestingly, the spike in delay occurred two 
days after the hurricane struck, as many residents migrated back into the city. This re-entry process was 
extremely slow when compared to the evacuation process before the hurricane, suggesting that more traffic 
management might be necessary following an event. The analysis of an extreme event like Hurricane Sandy 
demonstrates the ability of the proposed method to capture and describe atypical city-scale properties of 
the transportation network. 


5 Future Work 

This research is ongoing, and leaves several opportunities for improvement. For example, in section |2.2[ 
regions are chosen manually. Naturally, one may ask how the results will change when different regions are 
chosen, or when a different number of regions are used. This question can be answered empirically by trying 
several different partitioning schemes. It may be possible to automatically define regions via clustering as in 
[28] . On a related note, more GPS data is clearly required when more regions are used, in order to gain an 
adequate sample for each origin-destination pair. 

It is also possible to apply the outlier-detection methods to other types of paces. For example, instead 
of measuring paces between various origin-destination zones, one may desire to compute approximate paces 
on each link of the network graph. Algorithms exist which can estimate link travel times, for example (29] 
and |30j . but they are computationally expensive. If the same outlier-detection methods are applied to 
link-level pace data, it is possible to examine whether such a heavy computation is necessary. If the results 
are unchanged, the simpler method presented in this article may be sufficient. 
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